MAP Estimation of Whole-Word Acoustic Models with Dictionary Priors
Authors
Abstract
The intrinsic advantages of whole-word acoustic modeling are offset by the problem of data sparsity. To address this, we present several parametric approaches to estimating intra-word phonetic timing models under the assumption that relative timing is independent of word duration. We show evidence that the timing of phonetic events is well described by the Gaussian distribution. We explore the construction of models in the absence of keyword examples (dictionary-based), when keyword examples are abundant (Gaussian mixture models), and also present a Bayesian approach which unifies the two. Applying these techniques in a point process model keyword spotting framework, we demonstrate a 55% relative improvement in performance for models constructed from few examples.
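The Bayesian idea in the abstract can be illustrated with a minimal sketch of MAP estimation for a Gaussian mean under a conjugate Gaussian prior. This is not the authors' implementation. The evenly spaced prior in `dictionary_prior_means`, the pseudo-count `tau`, and all function names are assumptions made for illustration only. Phone timings are taken as positions in normalized word time, following the abstract's assumption that relative timing is independent of word duration.

```python
from typing import List, Sequence


def dictionary_prior_means(num_phones: int) -> List[float]:
    """Hypothetical dictionary prior: place each phone's center at an evenly
    spaced position in normalized word time [0, 1]. The source does not
    specify this form; it is an assumption for illustration."""
    return [(i + 0.5) / num_phones for i in range(num_phones)]


def map_mean(prior_mean: float, observations: Sequence[float], tau: float) -> float:
    """MAP estimate of a Gaussian mean with known variance and a conjugate
    Gaussian prior. tau is an assumed pseudo-count that sets how many
    observations the prior is worth."""
    n = len(observations)
    return (tau * prior_mean + sum(observations)) / (tau + n)


# A four-phone word: prior centers come from the pronunciation alone.
priors = dictionary_prior_means(4)

# With no keyword examples the estimate falls back to the dictionary prior.
no_data = map_mean(priors[0], [], tau=2.0)

# With a few examples the estimate moves toward the observed timings.
few = map_mean(priors[0], [0.25, 0.25], tau=2.0)
```

As the number of observations grows relative to `tau`, the estimate approaches the sample mean, which is the sense in which a single estimator can cover both the no-example and the abundant-example cases.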
Similar resources
Text-to-speech inspired duration modeling for improved whole-word acoustic models
In the construction of whole-word acoustic models, we have previously demonstrated substantial gains by using MAP estimation to introduce a simple prior model of phonetic timing. Based solely on the word’s phonetic (dictionary) pronunciation, this simple model included no information about the individual durations of constituent phones. However, the problem of modeling segmental duration has lo...
Developing 3 dimensional model for estimation of acoustic power in urban pathways in geo-spatial information system framework
Around the world, traffic growth is causing growing air and noise pollution. Noise levels in a given area are affected by traffic on the streets as well as other factors, including existing infrastructure and industrial centers. The purpose of this research is to model and estimate the amount of acoustic emission in the streets of Tehran's third district, using the 3D spatial info...
Audio Scene Understanding using Topic Models
This paper introduces a method to apply the topic models in an audio scene understanding framework. Assuming that an audio signal consists of latent topics that generate acoustic words describing an audio scene, we propose to use a vector quantization method to build an acoustic word dictionary. The classification experiments with semantic labels yield promising results of using the topic model...
Fabricating conversational speech data with acoustic models: a program to examine model-data mismatch
We present a study of data simulated using acoustic models trained on Switchboard data, and then recognized using various Switchboard-trained acoustic models. When we recognize real Switchboard conversations, simple development models give a word error rate (WER) of about 47 percent. If instead we simulate the speech data using word transcriptions of the conversation, obtaining the pronunciatio...
Real-time spontaneous Ukrainian speech recognition system based on word acoustic composite models
This paper describes implementation of methods and algorithms for the automatic speech recognition based on word composition proceeding from acoustic phoneme models. Such a design of the speech-to-text decoder is conventional and most productive for Western languages. The aim is to explore this approach applied to the Ukrainian language that is highly inflective with relatively free word order....
Publication date: 2012